Okay, so welcome back everybody to deep learning. Sorry for the slight delay. We can continue this lecture, and today we really want to go towards the deep architectures. This is the lecture where we finally look at deep networks. We obviously start with the early, rather shallow architectures, then we go deeper, and then even deeper, and in the end we will briefly discuss how deep we actually should go and whether there are other possibilities for designing architectures. Okay, so first a word on the datasets
that are typically used. We have already seen ImageNet, which has a thousand classes and roughly 14 million images. For development purposes, people often use smaller datasets, for example CIFAR-10 and CIFAR-100, which have 10 or 100 classes respectively, about 50,000 training samples and 10,000 test samples, and consist only of 32 by 32 pixel images. The reason why you want to use something like this is that when you're training your first architectures, you want to get an idea of how the problem behaves. You can start with a small dataset, also for debugging, and this is much quicker than training on millions of images from scratch. So this is quite a good dataset for trying out smaller network architectures, or for example for pre-training. Okay,
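To make the size difference concrete, here is a small back-of-the-envelope sketch (plain Python, my own illustration, not from the lecture) of the raw storage needed for CIFAR-10 versus ImageNet-scale data, assuming uint8 RGB images and, for ImageNet, a typical 224 by 224 crop size:

```python
def dataset_bytes(n_images, height, width, channels=3):
    """Raw uint8 storage for an image dataset (labels and overhead ignored)."""
    return n_images * height * width * channels

# CIFAR-10 training set: 50,000 images of 32 x 32 x 3
cifar10_train = dataset_bytes(50_000, 32, 32)
print(f"CIFAR-10 train: {cifar10_train / 1e6:.0f} MB")  # about 154 MB

# ImageNet-scale data at a 224 x 224 crop (image count is approximate)
imagenet = dataset_bytes(14_000_000, 224, 224)
print(f"ImageNet (approx.): {imagenet / 1e9:.0f} GB")   # on the order of 2 TB
```

A few hundred megabytes fit comfortably in memory, which is part of why CIFAR-scale experiments iterate so much faster.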
so let's go to the early architectures. The first architecture that we consider as more or less a deep architecture is LeNet, specifically LeNet-5. It has a convolutional layer, a pooling layer, another convolutional layer, another pooling layer, and then fully connected layers. You could argue that the first couple of layers are the feature extraction, trying to localize important information in the image, and at the end you have a classifier, and this classifier is fully connected like a traditional multilayer perceptron. You see here that we are highlighting the different features: there are bullets listing the key features of the network, and one of them is orange. That's not just because we wanted to put some color in our slides to make them more exciting; it actually has a meaning, namely that this is something that is still used today. So in LeNet it's essentially the convolution for spatial features that has been introduced and that is still very actively used. Okay, so this is a foundation for many other architectures, and you often see images of this LeNet architecture used to illustrate the idea of deep learning.
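As a quick sanity check of this layer stack, the feature-map sizes of LeNet-5 can be traced with the standard convolution output-size formula. This is a plain-Python sketch of my own, using the sizes of the original LeNet-5 (5x5 convolutions without padding, 2x2 subsampling with stride 2):

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                    # LeNet-5 input: a 32 x 32 image
size = out_size(size, 5)     # C1: 6 feature maps of 28 x 28
size = out_size(size, 2, 2)  # S2: subsampling to 14 x 14
size = out_size(size, 5)     # C3: 16 feature maps of 10 x 10
size = out_size(size, 2, 2)  # S4: subsampling to 5 x 5
# The 16 * 5 * 5 = 400 remaining values are flattened into the
# fully connected classifier (120 -> 84 -> 10 units).
print(size)  # 5
```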
So what else? Well, there's AlexNet, which you can see here in the top figure. You may wonder why AlexNet is cut in half. First of all, the figure is redundant: the top half and the bottom half are very similar. Second, this is actually the figure from the paper, so it really appears this way in the paper, and nobody knows how the full figure would look. Maybe it actually is the full figure. Unless you build a network to complete this figure, which would perhaps be an appropriate prediction task, you could just use Photoshop or something. Why is AlexNet so important? Its importance is clear because it was the 2012 winner of the ImageNet challenge, where it roughly cut the error rate in half. So AlexNet was really advancing the state of the art. You can see that it has many, very broad feature maps, these alternating convolution and pooling stages, and in the end again fully connected layers. The reason for the split into two halves is that AlexNet was trained on two GPUs: the two parts of the network ran on two separate graphics processing units, simply because of memory constraints. It has eight layers, and it was essentially the first, or at least the most popular, network that introduced GPU computation for training, which made it very efficient. It also introduced rectified linear units, and it uses overlapping max pooling. Overlapping max pooling is not so commonly used anymore, but the rectified linear units are still there, and the GPUs, as you've heard already plenty of times, are there too, and we need them because what we
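Two of the highlighted AlexNet ingredients, rectified linear units and overlapping max pooling (3x3 windows with stride 2), can be sketched in a few lines of NumPy. This is my own minimal illustration, not code from the paper:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: max(0, x), applied element-wise."""
    return np.maximum(x, 0.0)

def overlapping_max_pool(x, kernel=3, stride=2):
    """Max pooling on a 2-D map; kernel > stride makes the windows overlap."""
    h, w = x.shape
    oh = (h - kernel) // stride + 1
    ow = (w - kernel) // stride + 1
    out = np.empty((oh, ow), dtype=x.dtype)
    for i in range(oh):
        for j in range(ow):
            window = x[i * stride:i * stride + kernel,
                       j * stride:j * stride + kernel]
            out[i, j] = window.max()
    return out

fmap = np.arange(25, dtype=np.float32).reshape(5, 5)
pooled = overlapping_max_pool(relu(fmap))
print(pooled.shape)  # (2, 2)
```

Note that with kernel 3 and stride 2, adjacent pooling windows share a row or column of inputs, which is exactly the overlap the lecture refers to.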
Access: Open access
Duration: 00:00:00 min
Recording date: 2018-11-27
Uploaded: 2019-04-11 16:51:01
Language: en-US
Deep Learning (DL) has attracted much interest in a wide range of applications such as image recognition, speech recognition and artificial intelligence, both from academia and industry. This lecture introduces the core elements of neural networks and deep learning, it comprises:
- (multilayer) perceptron, backpropagation, fully connected neural networks
- loss functions and optimization strategies
- convolutional neural networks (CNNs)
- activation functions
- regularization strategies
- common practices for training and evaluating neural networks
- visualization of networks and results
- common architectures, such as LeNet, AlexNet, VGG, GoogLeNet
- recurrent neural networks (RNN, TBPTT, LSTM, GRU)
- deep reinforcement learning
- unsupervised learning (autoencoder, RBM, DBM, VAE)
- generative adversarial networks (GANs)
- weakly supervised learning
- applications of deep learning (segmentation, object detection, speech recognition, ...)